Improving Bayesian Reinforcement Learning Using Transition Abstraction
Abstract
Bayesian Reinforcement Learning (BRL) provides an optimal solution to online learning while acting, but it is computationally intractable for all but the simplest problems: at each decision point, an agent must weigh all possible courses of action by beliefs about future outcomes constructed over long time horizons. To improve tractability, previous research has focused on sparsely sampling the courses of action most relevant to computing value; sampling alone, however, does not scale well to larger environments. In this paper, we investigate whether an abstraction called projects (parts of the transition dynamics that bias the look-ahead toward promising areas of the environment) can scale up BRL to larger environments. We modify a sparse sampler to incorporate projects, test the algorithm on standard problems that require an effective exploration-exploitation balance, and show that learning can be significantly sped up compared to a simpler BRL algorithm and classic Q-learning.
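As a rough illustration of the idea, a sparse-sampling look-ahead whose action set is restricted by projects might be sketched as follows. All names here (`Project`, `sparse_value`, the toy chain model) are hypothetical illustrations of the general technique, not the paper's actual implementation:

```python
# Hypothetical sketch: sparse-sampling look-ahead biased by "projects",
# i.e. fragments of the transition dynamics that steer the search toward
# promising regions. Names and structure are illustrative assumptions.

class Project:
    """In any of `trigger_states`, the project suggests taking `action`."""
    def __init__(self, trigger_states, action):
        self.trigger_states = trigger_states
        self.action = action

def candidate_actions(state, actions, projects):
    """Restrict the look-ahead to project-suggested actions when one applies."""
    suggested = [p.action for p in projects if state in p.trigger_states]
    return suggested or actions  # otherwise fall back to all actions

def sparse_value(state, depth, model, actions, projects, width=3, gamma=0.95):
    """Sparse-sampling value estimate: `width` sampled successors per action."""
    if depth == 0:
        return 0.0
    best = float("-inf")
    for action in candidate_actions(state, actions, projects):
        total = 0.0
        for _ in range(width):
            next_state, reward = model.sample(state, action)
            total += reward + gamma * sparse_value(
                next_state, depth - 1, model, actions, projects, width, gamma)
        best = max(best, total / width)
    return best

class ChainModel:
    """Toy 4-state chain standing in for samples from the belief over dynamics."""
    def sample(self, state, action):
        step = 1 if action == "right" else -1
        next_state = max(0, min(3, state + step))
        return next_state, (1.0 if next_state == 3 else 0.0)
```

With a project suggesting "right" in states 0 through 2, the look-ahead expands one action per visited state instead of two, so the sampled tree shrinks exponentially in the planning depth.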
Similar resources
Using Trajectory Data to Improve Bayesian Optimization for Reinforcement Learning
Recently, Bayesian Optimization (BO) has been used to successfully optimize parametric policies in several challenging Reinforcement Learning (RL) applications. BO is attractive for this problem because it maintains Bayesian prior information about the expected return and exploits this knowledge to select new policies to execute. Effectively, the BO framework for policy search addresses the expl...
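The BO-for-policy-search loop described in this snippet can be sketched roughly as below. A toy distance-based surrogate stands in for a real Gaussian-process posterior, and the policy parameterization and return landscape are invented for illustration:

```python
# Hypothetical sketch of Bayesian optimization for policy search.
# A crude nearest-neighbor surrogate replaces a Gaussian process;
# all functions and the return landscape are illustrative assumptions.

def expected_return(theta):
    """Black-box objective: mean return of the policy parameterized by theta."""
    return -(theta - 0.7) ** 2  # toy landscape with its optimum at theta = 0.7

def surrogate(theta, observed):
    """Toy surrogate: value of the nearest evaluated point, uncertainty ~ distance."""
    nearest_theta, nearest_val = min(observed, key=lambda o: abs(o[0] - theta))
    return nearest_val, abs(theta - nearest_theta)

def ucb(theta, observed, kappa=1.0):
    """Upper confidence bound: surrogate mean plus an exploration bonus."""
    mean, sigma = surrogate(theta, observed)
    return mean + kappa * sigma

def bayes_opt(n_iters=20):
    candidates = [i / 100 for i in range(101)]    # coarse policy-parameter grid
    observed = [(0.0, expected_return(0.0))]      # initial policy evaluation
    for _ in range(n_iters):
        theta = max(candidates, key=lambda t: ucb(t, observed))
        observed.append((theta, expected_return(theta)))  # execute and record
    return max(observed, key=lambda o: o[1])[0]   # best parameter found
```

Each iteration picks the policy whose posterior looks most promising once uncertainty is counted in, evaluates it, and folds the observed return back into the model, which is the exploration-exploitation trade-off the snippet alludes to.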
Learning Qualitative Markov Decision Processes
To navigate in natural environments, a robot must decide the best action to take according to its current situation and goal, a problem that can be represented as a Markov Decision Process (MDP). In general, it is assumed that a reasonable state representation and transition model can be provided by the user to the system. When dealing with complex domains, however, it is not always easy or pos...
Proceedings of the ICML / UAI / COLT Workshop on Abstraction in Reinforcement Learning
A Core Task Abstraction Approach to Hierarchical Reinforcement Learning: (Extended Abstract)
We propose a new, core task abstraction (CTA) approach to learning the relevant transition functions in model-based hierarchical reinforcement learning. CTA exploits contextual independences of the state variables conditional on the task-specific actions; its promising performance is demonstrated through a set of benchmark problems.
Self-Organizing Perceptual and Temporal Abstraction for Robot Reinforcement Learning
A major current challenge in reinforcement learning research is to extend methods that work well on discrete, short-range, low-dimensional problems to continuous, high-diameter, high-dimensional problems, such as robot navigation using high-resolution sensors. We present a method whereby a robot in a continuous world can, with little prior knowledge of its sensorimotor system, environment, and ...
Publication date: 2009